Sed 4415 optimize reporting request and query performance#566
david-stephan wants to merge 10 commits
Conversation
This reverts commit eb71b1f.
...ntroller/step-controller-server/src/main/java/step/core/controller/StepControllerPlugin.java
AsyncTaskManager asyncTaskManager = context.require(AsyncTaskManager.class);
asyncTaskManager.scheduleAsyncTask((empty) -> {
    logger.info("ReportNode timeSeries ingestion for empty resolutions has started");
    reportNodeTimeSeries.getTimeSeries().ingestDataForEmptyCollections();
- I couldn't find where this was called previously. Wasn't it called before?
- Is it safe to do this asynchronously? I assume the controller would start and executions could begin creating data at the same time?
We have the same logic implemented for the time-series (response times), but it did not exist for the reportNodeTimeSeries.
Re-ingesting such new resolutions can take quite some time, and I would say doing it asynchronously is the only option. Since the empty resolutions are determined before any new execution can be triggered, and a dedicated ingestion pipeline is used, it should be safe. However, the filter currently used to re-ingest data is too permissive and could cause "new" buckets (created while re-ingesting) to be re-ingested. I will update the filter to make sure begin < controller_start_time (in ingestDataForEmptyCollections).
Filter filter = (Filter) (collection.getTtl() > 0L
        ? Filters.gte("begin", System.currentTimeMillis() - collection.getTtl())
        : Filters.empty());
try (Stream<Bucket> bucketStream = previousCollection.findLazy(filter, searchOrder)) {
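The fix discussed above can be sketched as a pair of bounds on the `begin` field: the existing TTL lower bound plus a new upper bound at the controller start time, so buckets ingested after startup are never re-ingested. This is a minimal illustration only; the method and parameter names below are assumptions, not the actual implementation in the PR.

```java
// Hypothetical sketch of the tightened re-ingestion window.
// Actual code would build Filters.and(Filters.gte(...), Filters.lt(...)).
public class ReingestionFilter {

    // Returns {lowerBound, upperBound} in epoch millis for the 'begin' field:
    // lower = now - ttl (or unbounded if no TTL), upper = controller start time.
    static long[] bounds(long nowMillis, long ttlMillis, long controllerStartMillis) {
        long lower = ttlMillis > 0 ? nowMillis - ttlMillis : Long.MIN_VALUE;
        return new long[]{lower, controllerStartMillis};
    }

    public static void main(String[] args) {
        long[] b = bounds(1_000_000L, 600_000L, 950_000L);
        // Only buckets with 400000 <= begin < 950000 are re-ingested.
        System.out.println(b[0] + ".." + b[1]); // 400000..950000
    }
}
```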
Thanks for the clarification. I agree that it should be done asynchronously. There are, however, a few things we should think about:
- Ensure the re-ingestion is not interrupted: we could explicitly state in the logs that the controller shouldn't be restarted during the re-ingestion, to avoid incomplete re-ingestions.
- Ensure new data is not re-ingested: I'm not sure the condition begin < controller_start_time would be enough for large resolutions. In theory, the last bucket could fulfill this condition and still be used by new executions while it is being re-ingested. Right?
- In any case, we should benchmark the re-ingestion for large data sets.
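One way to address the last-bucket concern raised above is to round the cutoff down to the previous resolution boundary, so the bucket that new executions may still be filling falls entirely outside the re-ingestion window. This is a sketch of that idea, not code from the PR; whether bucket boundaries are actually aligned this way in the time-series implementation is an assumption.

```java
// Hypothetical sketch: align the re-ingestion cutoff to a bucket boundary.
public class CutoffAlignment {

    // Rounds cutoffMillis down to the nearest multiple of resolutionMillis,
    // excluding the in-progress bucket from the re-ingestion window.
    static long alignDown(long cutoffMillis, long resolutionMillis) {
        return (cutoffMillis / resolutionMillis) * resolutionMillis;
    }

    public static void main(String[] args) {
        // With a 1h resolution, a controller start time mid-bucket is rounded
        // down so the current (possibly still written) bucket is excluded.
        System.out.println(alignDown(3_700_000L, 3_600_000L)); // 3600000
    }
}
```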
@Override
public void runUpgradeScript() {
    log.info("Renaming time-series 'main' collections to match their resolutions");
I was about to write that we should have only one main collection, and only later understood that we have two time series. We might explicitly list the two collection names.
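Making the log entry enumerate both affected collections could look like the sketch below. The collection names used here are placeholders invented for illustration; the real names in the PR may differ.

```java
import java.util.Map;

// Hypothetical sketch: log each rename explicitly instead of a generic message.
public class MainCollectionRenamer {

    // Placeholder old -> new names; the PR's actual collection names may differ.
    static final Map<String, String> RENAMES = Map.of(
            "timeseries", "timeseries_minute",
            "reportNodeTimeSeries", "reportNodeTimeSeries_minute");

    public static void main(String[] args) {
        RENAMES.forEach((oldName, newName) ->
                System.out.println("Renaming time-series collection '"
                        + oldName + "' to '" + newName + "'"));
    }
}
```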
private void updateSettingKeyIfPresent(String oldKey, String newKey) {
    Optional<Document> setting = settings.find(Filters.equals("key", oldKey), null, null, null, 0).findFirst();
    setting.ifPresent(s -> {
        s.put("key", newKey);
I would add an explicit log entry here to be fully transparent.
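The suggested log entry could be added right before the key is rewritten, as in this simplified sketch. The `Document`/`settings` types from the PR are replaced here by a plain `Optional<String>` for illustration; only the logging pattern is the point.

```java
import java.util.Optional;

// Hypothetical sketch: migrate a setting key and log the change explicitly.
public class SettingsMigration {

    static Optional<String> updateSettingKeyIfPresent(Optional<String> setting,
                                                      String oldKey, String newKey) {
        // Explicit trace so the migration is fully transparent in the logs.
        setting.ifPresent(s ->
                System.out.println("Migrating setting key '" + oldKey
                        + "' to '" + newKey + "'"));
        return setting.map(s -> newKey);
    }

    public static void main(String[] args) {
        updateSettingKeyIfPresent(Optional.of("value"), "old.key", "new.key");
    }
}
```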
...ep-plans-core/src/main/java/step/core/artefacts/reports/aggregated/ReportNodeTimeSeries.java
public enum Resolution {
    FIVE_SECONDS("5_seconds", Duration.ofSeconds(5), Duration.ofSeconds(1), false),
    FIFTEEN_SECONDS("15_seconds", Duration.ofSeconds(15), Duration.ofSeconds(5), false),
    ONE_MINUTE("minute", Duration.ofMinutes(1), Duration.ofSeconds(10), false),
If we rename the main collections, shouldn't we rename this one to 1_minute to be consistent?
I wanted to limit the migrations, but you're right
If this is too much of an effort (including FE), we can also do it in the next major.
    return enabledCollections;
}

public List<TimeSeriesCollection> getSingleTimeSeriesCollections(String mainCollectionName, TimeSeriesCollectionsSettings collectionsSettings, Duration resolution, Long flushInterval) {
Doesn't seem to be used
List<TimeSeriesCollection> enabledCollections = new ArrayList<>();
int flushSeriesQueueSize = collectionsSettings.getFlushSeriesQueueSize();
int flushAsyncQueueSize = collectionsSettings.getFlushAsyncQueueSize();
addIfEnabled(enabledCollections, mainCollectionName,
If for some reason the settings of a time series don't exist, the whole collection is dropped in addIfEnabled. Isn't this a bit risky? Should we foresee an explicit cleanup flag or something similar to avoid unexpected drops of the main collection?
I think I'll completely remove the drop when the collection is disabled. Admins can always manually drop the collection in the DB if required.
Exactly. I think it is safer
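The agreed-upon behaviour (skip a disabled resolution instead of dropping its collection) could look like this simplified sketch. The real `addIfEnabled` works with `TimeSeriesCollection` instances and settings objects; plain strings and a boolean flag stand in for them here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a disabled resolution is skipped (and logged),
// never dropped automatically. Manual cleanup stays an admin decision.
public class CollectionRegistry {

    static List<String> addIfEnabled(List<String> collections, String name, boolean enabled) {
        if (enabled) {
            collections.add(name);
        } else {
            System.out.println("Time-series collection '" + name
                    + "' is disabled; skipping it without dropping existing data");
        }
        return collections;
    }

    public static void main(String[] args) {
        List<String> cols = new ArrayList<>();
        addIfEnabled(cols, "main_minute", true);
        addIfEnabled(cols, "main_hour", false);
        System.out.println(cols); // [main_minute]
    }
}
```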
No description provided.